HLA-A, -B, -C, -DRB1 and -DQB1 allele and haplotype frequencies in Lebanese and their relatedness to neighboring and distant populations

Background: This study examined the origin of present-day Lebanese using high-resolution HLA class I and class II allele and haplotype distributions. The study subjects comprised 152 unrelated individuals, and their HLA class I and class II alleles and two-locus and five-locus haplotypes were compared with those of neighboring and distant communities using genetic distances, neighbor-joining dendrograms, correspondence, and haplotype analyses. HLA class I (A, B, C) and class II (DRB1, DQB1) alleles were genotyped at a high-resolution level by PCR-SSP.
Results: In total, 76 alleles across the five HLA loci were detected: A*03:01 (17.1%), A*24:02 (16.5%), B*35:01 (25.7%), C*04:01 (25.3%), and C*07:01 (20.7%) were the most frequent class I alleles, while DRB1*11:01 (34.2%) and DQB1*03:01 (43.8%) were the most frequent class II alleles. All pairs of HLA loci were in significant linkage disequilibrium. The most frequent two-locus haplotypes recorded were DRB1*11:01 ~ DQB1*03:01 (30.9%), B*35:01 ~ C*04:01 (20.7%), B*35:01 ~ DRB1*11:01 (13.8%), and A*24:02 ~ B*35:01 (10.3%). Lebanese appear to be closely related to East Mediterranean communities such as Levantines (Palestinians, Syrians, and Jordanians), Turks, Macedonians, and Albanians. However, Lebanese appear to be distinct from North African, Iberian, and Sub-Saharan communities.
Conclusions: Collectively, this indicates a limited genetic contribution of Arabic-speaking populations (from North Africa or the Arabian Peninsula) and Sub-Saharan communities to the present-day Lebanese gene pool. This confirms the notion that the Lebanese population is of mixed East Mediterranean and Asian origin, with a marked European component.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08682-7.


Background
The Human Leukocyte Antigen (HLA) system is among the most polymorphic systems in mammals. As of January 23, 2019, 21,499 alleles of the HLA genes (15,586 class I and 5,913 class II) have been reported, of which the B locus, with 5,881 alleles, is the most polymorphic (http://hla.alleles.org). The HLA region lies on the short arm of chromosome 6 (6p21.3) and harbors in excess of 220 genes involved in diverse functions [1]. The presence of hundreds of genes within a 3.6-Mb distance leads to the bulk transmission of haplotypes due to linkage disequilibrium, defined as the nonrandom (preferential) association between alleles of close loci (http://hla.alleles.org). The HLA genes play a key role in the immune response [1] and in the pathogenesis of diseases, mostly autoimmune [2][3][4], and are very valuable tools in tracing the history of human migration due to the presence of linkage disequilibrium as well as allelic, genetic, and protein diversity [5,6].
Open Access. Correspondence: wassim.almawi@outlook.com
Lebanon is an East Mediterranean country and, with an area of 10,452 km², is a small state in mainland Asia. The location of Lebanon at the crossroads of Asia, Europe, and Africa has contributed to its 5,000-year-old history and resulted in a distinct cultural identity marked by religious and ethnic diversity. Lebanon was home to the Phoenicians, who settled the country for almost 3,000 years but were then subject to waves of invasion, starting with the Assyrians invading Phoenicia (seventh century BC), followed shortly by the Egyptians and, subsequently, Alexander the Great in the fourth century BC [7]. Following the division of the Roman Empire into the Western Empire and the Eastern Empire (Byzantium), Lebanon fell under Byzantine rule from 395 to 634 [7,8]. Following the Arab conquest and the capture of Damascus in 635, Lebanon was ruled by the Umayyad (660-750), Abbasid (749-1258), and Fatimid (909-1171) dynasties and, from 1516, by the Ottomans, who ruled most of the present-day Middle East and North Africa until 1918 [7,8]. The limited Crusades between 1096 and 1271 witnessed the introduction of European influence into Greater Syria (Lebanon, Syria, and Palestine) and the enforcement of Christianity in the mountain regions.
The population of Lebanon (est. 6,859,408) comprises descendants of diverse ethnicities who are either indigenous or have invaded and occupied Lebanon over the past six millennia. This linguistic, religious, and racial diversity is associated with significant admixture, making present-day Lebanon a mosaic of interrelated cultures. This paper investigates the HLA profile of the Lebanese population, which is compared to the profiles of neighboring and distant populations. It is the first to examine both class I and class II profiles and the first to identify common five-locus HLA haplotypes in the Lebanese population.

Allelic comparison between the Lebanese and other populations
The differences in the typing methods between the study group and reference populations affected the data presentation, notably the calculation of the SGD and the comparison between the populations. The HLA profiles of the 152 Lebanese participants were compared to those of other Arabic-speaking, Mediterranean, and Sub-Saharan populations using high- and low-resolution HLA data; the latter were included because some reference populations lacked high-resolution data. Using DRB1 and DQB1 allele frequencies, standard genetic distance (SGD) analysis identified three clusters (Fig. 1). The first comprised East Mediterranean (pan-Lebanese, Palestinians, Greeks, Syrians, Cretans, Macedonians, Albanians, and Turks), Italians, Iranians, Iraqi Kurds, and Ashkenazi Jews. The second included Iberians, North Africans, Saudis, French, and Egyptians, while the third comprised Sub-Saharans (Bubi, Mandenka, Mossi, Fulani, and Rimaibe). NJ dendrograms built from SGDs based on HLA-A and -B allele frequencies also identified three population clusters (Fig. 2). The first included East Mediterranean (pan-Lebanese, Palestinians, Cretans, Macedonians, Albanians, Greeks, and Turks), Iranians, Jordanians, Italians, Iraqi Kurds, and Ashkenazi Jews. The second contained Iberians, North Africans, Saudis, and French, while the third contained Sub-Saharans.
SGDs showed no clear discontinuities between the Lebanese-H and other populations (Supplementary Table 1). Based on the data of A, B, DRB1, and DQB1 loci, SGDs confirmed that Lebanese-H are closer to East Mediterranean than West Mediterranean populations, but distant from Sub-Saharans. SGDs showed that Lebanese-A (8.2 × 10⁻³), Iraqi Kurds (2.8 × 10⁻⁴), Palestinians (4.1 × 10⁻⁴), Cretans (4.9 × 10⁻⁴), and Turks-A (5.8 × 10⁻⁴) have the closest genetic distances to the Lebanese-H. Collectively, this confirms the relatedness of present-day Lebanese to neighboring Mediterranean, Levantine, and European populations.
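As an illustration of how a distance of this kind can be computed from allele frequencies, the following is a minimal sketch that assumes SGD corresponds to Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx · Jy)); the text does not spell out the formula or the software used. The function name and the toy frequencies are hypothetical and are not data from this study.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x and pop_y map each locus name to a dict of allele -> frequency.
    Jx, Jy and Jxy are averaged over loci before the logarithm is taken.
    Assumption: this is the "SGD" meant in the text.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("the two populations share no typed locus")
    j_x = j_y = j_xy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        alleles = set(fx) | set(fy)
        j_x += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        j_y += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        j_xy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    n = len(loci)
    j_x, j_y, j_xy = j_x / n, j_y / n, j_xy / n
    if j_xy == 0.0:
        return math.inf  # no allele shared at any locus
    return -math.log(j_xy / math.sqrt(j_x * j_y))

# Toy single-locus frequencies, invented for illustration only.
pop_1 = {"DRB1": {"a": 0.5, "b": 0.5}}
pop_2 = {"DRB1": {"a": 1.0}}
print(nei_standard_distance(pop_1, pop_2))
```

Identical frequency profiles give a distance of zero, and the distance grows as the two populations share fewer alleles, which is the property used when ranking populations by their SGD to the Lebanese-H.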

Class I and class II extended haplotype analysis
Extended class I and class II haplotype analysis using 304 chromosomes from the 152 subjects identified 198 five-locus haplotypes (Table 3).

EWH test of neutrality
The results of the EWH test for the five HLA loci in the Lebanese are shown in Supplementary Table 4. No significant deviation was found for B (P = 0.203), C (P = 0.073), DRB1 (P = 0.180), or DQB1 (P = 0.102) loci; significant deviation was observed only for the A locus (P = 0.025).
Negative values of the normalized deviate of homozygosity (Fnd) were recorded for the analyzed loci, and the homozygosity was lower than expected under selective neutrality. Significant differences were noted between the observed and expected homozygotes for the DRB1 (P = 0.033) and DQB1 (P = 0.019) loci. Together, these findings indicate an overall trend away from the null hypothesis of neutral evolution and suggest that the allele frequency distributions at all loci were shaped by balancing selection.
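To make the sign of Fnd concrete, the sketch below computes the observed homozygosity as the sum of squared allele frequencies and normalizes its deviation from the neutral expectation. It assumes the usual definition, Fnd = (F_obs - F_exp) / SD(F_exp). The neutral mean and variance normally come from simulation under the neutral model for the same sample size and number of alleles; here they are passed in as placeholder values, and all numbers are invented for illustration.

```python
import math

def observed_homozygosity(allele_freqs):
    """Expected proportion of homozygotes given the allele frequencies."""
    return sum(p * p for p in allele_freqs)

def normalized_deviate(f_obs, f_exp_mean, f_exp_var):
    """Fnd: observed minus neutral-expected homozygosity, in SD units."""
    return (f_obs - f_exp_mean) / math.sqrt(f_exp_var)

# Toy locus with three alleles (hypothetical frequencies).
f_obs = observed_homozygosity([0.5, 0.25, 0.25])

# Placeholder neutral expectation; in practice obtained by simulation.
fnd = normalized_deviate(f_obs, f_exp_mean=0.5, f_exp_var=0.0025)
print(f_obs, fnd)
```

A negative Fnd means the alleles are more evenly distributed than neutrality predicts, which is the pattern expected under balancing selection; a positive value would point toward directional selection.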

Genetic admixture in Lebanese
The estimation of the genetic contribution rates to the Lebanese was performed using A, B, DRB1, and DQB1 loci from parental populations from Italy (Europe), Pakistan (Asia), Morocco (North Africa), and Mossi (Sub-Saharan Africa) (Table 4). The most notable contribution was seen from Europeans (0.8434 to 1.0742), followed by Asians (0.1566 to 0.2070). The North African and Sub-Saharan contributions to the Lebanese genetic pool were low, as indicated by the negative value of the admixture coefficient established for Mossi (-0.1117 to -0.0273) and Moroccans (-0.2539). Similar results were found regardless of the selected population (Sub-Saharan or North African).

Discussion
Previous reports on the HLA profile of Lebanese focused on class II (DRB1 and DQB1) alleles and haplotype analyses, wherein statistical and anthropological analyses were virtually absent [5,9,10,11]. The present work used the molecular data of both class I (A, B, C) and class II (DRB1, DQB1) loci to examine the possible origin of present-day Lebanese, interpreting the results in a historical context. Using high-resolution molecular typing, 76 alleles were detected. However, allelic comparison of Lebanese to neighboring and distant populations was not always useful in view of the scarcity or absence of high-resolution data (six digits), mostly in neighboring populations. This limited the comparison to lower-resolution (four digits) levels.
Because their genes are separated by a short physical distance (PD) of 0.1 Mb, the C:B (D' = 0.8179) and DRB1:DQB1 (D' = 0.7971) locus pairs had the highest LD values. The C:DQB1 pair had the weakest association (D' = 0.3343), consistent with the larger PD separating the C and DQB1 genes, which promotes an increased recombination rate. This was reminiscent of earlier studies, which documented that the D' values are inversely proportional to the PD separating the two loci, as the recombination rate increases with the PD [14,15]. 3) was attributed to the existence of recombination hot spots between specific HLA genes and/or the low levels of polymorphism seen at C and DQB1 loci relative to A and B loci. Furthermore, negative Fnd values were seen for all loci, indicating an overall direction toward balancing selection. This was in agreement with an earlier study documenting balancing selection in A, C, B, DRB1, DQA1, and DQB1 HLA loci, with DQA1 displaying the strongest evidence of selection [16].
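For readers unfamiliar with D', the following minimal sketch computes Lewontin's normalized disequilibrium for a single pair of alleles, using the allele and two-locus haplotype frequencies quoted in the abstract. This is an allele-pair calculation given for illustration only. The locus-level D' values reported above aggregate over all allele pairs at the two loci and will therefore differ from these figures. The function name is hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' for one allele pair.

    p_ab: frequency of the haplotype carrying both alleles
    p_a, p_b: frequencies of the two alleles at their respective loci
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # Report the magnitude; return 0 when an allele is fixed or absent.
    return abs(d) / d_max if d_max > 0 else 0.0

# Frequencies taken from the abstract (as proportions).
b35_c04 = d_prime(p_ab=0.207, p_a=0.257, p_b=0.253)    # B*35:01 ~ C*04:01
dr11_dq03 = d_prime(p_ab=0.309, p_a=0.342, p_b=0.438)  # DRB1*11:01 ~ DQB1*03:01
print(round(b35_c04, 3), round(dr11_dq03, 3))
```

Dividing D by its maximum attainable value for the given allele frequencies keeps D' between 0 and 1, which is what allows LD to be compared across locus pairs with different allele frequency spectra.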
Our analysis showed that the Lebanese participants were closely related to East Mediterranean (Turks, Albanians, Macedonians, Greeks, and Cretans), Levantine Arab (Syrians, Jordanians, and Palestinians), and Mesopotamian (Iraqis) populations. This can be explained by the fact that East Mediterranean countries share, with slight differences, a similar history and the same territory [17]. The Eastern Mediterranean Basin was historically characterized by high migratory flow between its sub-regions in all directions and in different periods (Greeks, Romans, and Ottomans). This favored admixture, reduced distances, and homogenized the Great Levant populations. The relatedness between the Levantine Arab populations is attributed to their close geographical proximity, as their lands constituted one territory before the nineteenth-century British and French colonization. It is also attributed to their common ancient Canaanite ancestry, originating from East Africa or the Arabian Peninsula via Egypt in 3300 BC [18] and settling in the Levantine lowlands following the collapse of the Ghassulian civilization in 3800-3350 BC [19].
Based on data from A, B, DRB1, and DQB1 loci, admixture analysis showed that most (up to 84%) of the genetic contribution to the Lebanese gene pool is derived from Europeans, with low genetic contributions from other regions, including the Arabian Peninsula, suggesting a low contribution of Arabs and Sub-Saharans to the Lebanese gene pool. This is in accord with the other analyses carried out in this work. Using high-resolution data, the analysis of the five HLA loci confirmed that the Lebanese are distant from North African (Tunisians, Moroccans, and Algerians), Iberian (Basques, Murcians, and Spaniards), and Arabian Peninsula (Saudis, Kuwaitis, and Emiratis) populations. This suggests a lack of contribution of North African and Arabian Peninsula populations to the gene pool of the Lebanese despite the Phoenicians' invasion and long colonization of North Africa and the Arab conquest of Lebanon from as early as the seventh century, prompting speculation of "elite colonization" [20].

Conclusions
In conclusion, our study based on NJ dendrograms, genetic distances, LD, admixture, and correspondence analyses showed that the Lebanese are related to

Study subjects
The study subjects comprised 152 unrelated healthy Lebanese individuals of both sexes (90 males and 62 females), who were randomly recruited from the five provinces and the six major religious groups of Lebanon. These comprised hospital and university staff, blood donors, and volunteers from the community. None of the study participants suffered from any acute or chronic disease, including neurologic, cardiac, or metabolic diseases, and none were on any medication at the time of specimen collection. The individuals were subjected to HLA class I and class II high-resolution genotyping and phylogenetic calculations. The origins of the other populations included for comparative purposes are detailed in Supplementary

HLA genotyping
The Qiagen mini-spin column extraction kit was used to extract genomic DNA from EDTA-anticoagulated venous blood according to the manufacturer's instructions (Qiagen, Hilden, Germany). Low-resolution HLA-A, HLA-B, HLA-C, HLA-DRB1, and HLA-DQB1 typing was performed using generic polymerase chain reaction with sequence-specific primers (PCR-SSP) kits (One Lambda, Thousand Oaks, CA), while high-resolution typing was performed by PCR-SSP using SSP1L (class I) and SSP2L (class II) HLA genotyping kits according to the manufacturer's specifications (Luminex-One Lambda, Canoga Park, CA).